Add static to fmt string to get it put into rodata #1422
Merged
Clang will only put DEBUG_PRINT fmt strings into rodata if they
are declared "static const char[]"; we were missing the static. Without it,
clang generates code to construct the fmt string on the stack, which
wastes both instructions and stack space.
Before:
```
; DEBUG_PRINT("beam: no PerCPURecord found");
10: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0x0 ll
12: 61 11 00 00 00 00 00 00 w1 = *(u32 *)(r1 + 0x0)
; DEBUG_PRINT("beam: no PerCPURecord found");
13: 15 01 9d 03 00 00 00 00 if r1 == 0x0 goto +0x39d <LBB0_107>
14: b7 01 00 00 75 6e 64 00 r1 = 0x646e75
; DEBUG_PRINT("beam: no PerCPURecord found");
15: 63 1a 78 ff 00 00 00 00 *(u32 *)(r10 - 0x88) = w1
16: 18 01 00 00 65 63 6f 72 00 00 00 00 64 20 66 6f r1 = 0x6f662064726f6365 ll
18: 7b 1a 70 ff 00 00 00 00 *(u64 *)(r10 - 0x90) = r1
19: 18 01 00 00 20 50 65 72 00 00 00 00 43 50 55 52 r1 = 0x5255504372655020 ll
21: 7b 1a 68 ff 00 00 00 00 *(u64 *)(r10 - 0x98) = r1
22: 18 01 00 00 62 65 61 6d 00 00 00 00 3a 20 6e 6f r1 = 0x6f6e203a6d616562 ll
24: 7b 1a 60 ff 00 00 00 00 *(u64 *)(r10 - 0xa0) = r1
25: bf a1 00 00 00 00 00 00 r1 = r10
26: 07 01 00 00 60 ff ff ff r1 += -0xa0
27: b7 02 00 00 1c 00 00 00 r2 = 0x1c
28: 85 00 00 00 06 00 00 00 call 0x6
29: 05 00 8d 03 00 00 00 00 goto +0x38d <LBB0_107>
```
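To make the cost in the listing above concrete: the four immediates loaded at offsets 14 through 22 are the format string itself, cut into little-endian chunks and stored at `r10 - 0xa0` upward. The following is a minimal userspace sketch that rebuilds the string from those constants. The helper names `store_le` and `rebuild_fmt` are illustrative only and do not exist in the profiler source.

```c
#include <stddef.h>
#include <stdint.h>

/* Store the low `n` bytes of `v` at `dst` in little-endian order, which is
 * how a store behaves on a little-endian BPF target. */
static void store_le(unsigned char *dst, uint64_t v, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (unsigned char)(v >> (8 * i));
}

/* Rebuild the format string the way the stores at offsets 15-24 do.
 * Index 0 of `buf` corresponds to stack address r10 - 0xa0. */
const char *rebuild_fmt(void)
{
    static unsigned char buf[0x1c];                  /* r2 = 0x1c = 28 bytes */

    store_le(buf + 0x18, 0x646e75, 4);               /* *(u32 *)(r10 - 0x88) */
    store_le(buf + 0x10, 0x6f662064726f6365ULL, 8);  /* *(u64 *)(r10 - 0x90) */
    store_le(buf + 0x08, 0x5255504372655020ULL, 8);  /* *(u64 *)(r10 - 0x98) */
    store_le(buf + 0x00, 0x6f6e203a6d616562ULL, 8);  /* *(u64 *)(r10 - 0xa0) */
    return (const char *)buf;
}
```

Calling `rebuild_fmt()` yields the 27-character message plus its NUL terminator, matching the size of 0x1c passed in `r2`. Every call site that prints a message pays for a sequence like this when the string lives on the stack.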
After:
```
; DEBUG_PRINT("beam: no PerCPURecord found");
10: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0x0 ll
12: 61 11 00 00 00 00 00 00 w1 = *(u32 *)(r1 + 0x0)
; DEBUG_PRINT("beam: no PerCPURecord found");
13: 15 01 11 02 00 00 00 00 if r1 == 0x0 goto +0x211 <LBB0_100>
14: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0x0 ll
16: b7 02 00 00 1c 00 00 00 r2 = 0x1c
17: 85 00 00 00 06 00 00 00 call 0x6
18: 05 00 0c 02 00 00 00 00 goto +0x20c <LBB0_100>
```
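For readers who want to see the source-level difference, here is a hedged sketch of the two macro shapes. It assumes the print macro is a GNU statement-expression wrapper that declares a local `____fmt[]` array and passes it to the trace helper (`call 0x6` in the listings is `bpf_trace_printk`). `PRINT_STACK`, `PRINT_RODATA`, `fake_trace_printk`, and `demo` are made-up names for this example, and the helper is a userspace stand-in so the snippet builds outside the kernel.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for bpf_trace_printk: formats into a
 * buffer instead of the kernel trace pipe. */
static char trace_buf[128];

static int fake_trace_printk(const char *fmt, unsigned int fmt_size, ...)
{
    va_list ap;
    int n;

    (void)fmt_size; /* the real helper uses this as the fmt buffer length */
    va_start(ap, fmt_size);
    n = vsnprintf(trace_buf, sizeof(trace_buf), fmt, ap);
    va_end(ap);
    return n;
}

/* Before: automatic array. For BPF targets clang materializes it on the
 * stack with immediate loads and stores at every call site. */
#define PRINT_STACK(fmt, ...)                                            \
    ({                                                                   \
        const char ____fmt[] = fmt;                                      \
        fake_trace_printk(____fmt, sizeof(____fmt), ##__VA_ARGS__);      \
    })

/* After: static storage. The array can live in .rodata, so the call site
 * only needs to load its address. */
#define PRINT_RODATA(fmt, ...)                                           \
    ({                                                                   \
        static const char ____fmt[] = fmt;                               \
        fake_trace_printk(____fmt, sizeof(____fmt), ##__VA_ARGS__);      \
    })

/* Both variants format the same text; only the storage of ____fmt differs. */
int demo(void)
{
    int a = PRINT_STACK("beam: no PerCPURecord found");
    int b = PRINT_RODATA("beam: no PerCPURecord found");
    return a == b;
}
```

In userspace the two macros behave identically, so this only illustrates the declaration change. The instruction savings show up when the code is compiled for the BPF target, as in the Before and After listings.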
Not sure if there was a reason why we were avoiding rodata for fmt strings; an unhappy accident, I suppose.
Instruction count: before vs. after static const char[] fmt strings
christos68k approved these changes on May 15, 2026
gnurizen added a commit to parca-dev/opentelemetry-ebpf-profiler that referenced this pull request on May 26, 2026
gnurizen added a commit to parca-dev/opentelemetry-ebpf-profiler that referenced this pull request on May 26, 2026
…imit

After kernel-bump open-telemetry#1310 removed the trailing "\n" from the printt ____fmt[] declaration, the recompiled BPF program cuda_activity_batch_tail exceeds the 6.16 verifier's path-tracking complexity budget and is rejected with "argument list too long". The cuda.ebpf.c source did not change, but the compiler's inlining/path-analysis decisions shifted.

Making ____fmt[] static puts the per-call-site format strings in .rodata instead of materializing them on the BPF stack at every callsite, which substantially reduces per-program complexity (instruction counts: amd64 62826 -> 44954, arm64 71875 -> 53960).

Source change ported from upstream PR open-telemetry#1422 (commit 7796e25). Rebuilt binaries locally to match the current source tree.
This was referenced May 26, 2026